Abstract: A content-based image/video retrieval system is a
querying system that uses content as the key for the retrieval
process. Designing an automatic retrieval system is difficult
because real-world images usually contain very complex
objects and color information. In this paper, we discuss some of
the key contributions of the current decade related to image
retrieval and automated image annotation. We also discuss
some of the key challenges involved in adapting existing image
retrieval techniques to build useful systems that can handle
real-world data, a need that grows as content-based image
retrieval becomes a source of exact and fast retrieval. The
techniques of content-based image retrieval are discussed,
analyzed, and compared. Features such as visual descriptors
and ontology methods are also introduced, together with
suggestions for feature methodologies that overcome the
difficulties and improve retrieval performance. We provide an
overview of approaches to CBIR, and the major approaches to
improving retrieval effectiveness via relevance feedback in
text retrieval systems are discussed.
Index Terms: Inference mechanisms, multimedia databases,
content-based image retrieval, visual descriptor, ontology.


I. INTRODUCTION

Content-based image retrieval (CBIR) is the application of
computer vision techniques to the image retrieval problem,
i.e., the problem of searching for digital images in large
databases. An image retrieval system is a computer system for
browsing, searching, and retrieving images from a large
database of digital images. Color, shape, and texture are
important cues for extracting information from images;
histograms of these features are widely used in content-based
image retrieval [13]. Color and texture contain important
information but, for instance, two images with similar color
histograms can represent very different things. Shape-describing
features are therefore essential in an efficient content-based
image retrieval system. Although shape description has been
intensively researched, there is no direct answer as to which
kind of shape features should be incorporated into such a
system [14]. Most traditional and common methods of image
retrieval add metadata such as captions, keywords, or
descriptions to the images so that retrieval can be performed
over the annotation words.
Image retrieval has been a very active research area since the
1970s, driven by two major research communities: database
management and computer vision. These two communities
study image retrieval from different angles, one text-based
and the other visual-based [15]. The fundamental difference
between content-based and text-based retrieval systems is that
human interaction is an indispensable part of the latter.
Humans tend to use high-level features, such as keywords and
text descriptors, to interpret images and measure their
similarity, whereas the features automatically extracted using
computer vision techniques are mostly low-level features [16].
Early techniques of image retrieval were based on the manual
textual annotation of images, a cumbersome and often
subjective task. Text alone is not sufficient, because the
interpretation of what we see is hard to characterize in words.
Hence the contents of an image (color, shape, and texture)
started gaining prominence [17]. Manual annotation suffers
from the large amount of effort required to develop the
annotations, differences in the interpretation of image
contents, and inconsistency of keyword assignments among
different indexers. As the size of image repositories increases,
the keyword annotation approach becomes infeasible. To
overcome these difficulties, an alternative mechanism,
content-based image retrieval, is used [18].

Biomedical images are frequently used in 
publications to illustrate the medical concepts or to highlight 
special cases. Conventional approaches for biomedical 
journal article retrieval have been text-based with little 
attention devoted to the use of images in the articles. 
Text-based retrieval uses text information automatically
extracted from the title, abstract, figure captions, and
discussions of the figures (mentions). It provides fairly good
results; however, the relevance quality is sometimes
unsatisfactory.
Content-based image retrieval (CBIR) also has been 
applied to biomedical image retrieval [19]. 

Clinicians and medical researchers routinely use online 
databases such as MEDLINE to search for bibliographic 
citations that are relevant to a clinical situation. The 
biomedical evidence they seek is available through clinical 
decision support systems (CDSS) that use text-based retrieval 
enhanced with biomedical concepts. Authors of biomedical 
publications frequently use images to illustrate the medical 
concepts or to highlight special cases. These images often 
convey essential information and can be very valuable for
improved clinical decision support (CDS) and education. 
The text-based retrieval of the images has, so far, been 
limited mostly to caption and/or citation information. To be 
of greater value, images in scientific publications need to be 
first annotated (preferably, automatically) with respect to 
their usefulness for CDS to help determine relevance to a 
clinical query or to queries for special cases important in 
educational settings [1-3]. 

This article discusses a method for multimodal image 
annotation that utilizes (i) image analysis techniques for 
localization and recognition of author provided overlays on 
the images; (ii) image feature extraction methods for 
content-based image retrieval (CBIR); (iii) natural language 
processing techniques for identifying key terms in the title, 
abstract, figure caption, and figure citation (mention) in the 
article; and (iv) use of structured vocabularies, such as the
National Library of Medicine’s Unified Medical Language
System (UMLS), for identifying the biomedical concepts in
the text [3].

As discussed in earlier works [4], these steps can be 
used to associate the biomedical concepts in the text to 
specific regions in the image. The relevance to a clinical 
query is aided by this addition of semantic information to 
extracted image features for improved CBIR. Traditionally, 
CBIR tends to be limited to use of visual features in 
identifying similarity among a collection of images. This has 
spurred discussion on the “semantic gap” [5] that is
introduced when high-level concepts are represented through
low-level visual features such as image color and texture.
Such a semantic gap can be minimized through
annotation by biomedical concepts that are extracted from 
the article text and applied to relevant regions within an 
image. 

General content-based image retrieval (CBIR) could
also be improved by the proposed approach, in much the same
way as text-based retrieval is improved. In this case no text
information is available and only visual features are used.
CBIR identifies relevant articles, as text-based retrieval does
in the multimodal method. Annotations and regions of interest
(ROIs) in the retrieved images can then be identified by the
annotation recognizer and used to re-rank the results [3].

At present, images needed for instructional 
purposes or clinical decision support (CDS) appear in 
specialized databases or in biomedical publications and are 
not meaningfully retrievable using primarily text-based 
retrieval systems. Our goal is to automatically annotate 
images extracted from scientific publications with respect to 
their usefulness for CDS. A future clinical decision support 
system (CDSS) could then provide images relevant to a 
clinical query or to queries for special cases important in 
educational settings. An important step toward attaining the 
goal is automatically annotating images and related text. Our 
approach to automatic image indexing is to describe (or 
annotate) an image at three levels of granularity: (1) coarse,
which addresses (a) image modality, (b) relation to a specific
clinical task (image utility), and (c) body location; (2) medium,
which provides a more detailed description of the image using
biomedical domain ontologies; and (3) specific, which provides
a very detailed description of clinical entities and events in an
image using terms that are not included in existing
ontologies and are often familiar only to clinicians
specializing in a narrow area of medicine [7].

CBIR involves the following four parts in
system realization: collecting the data, building the feature
database, searching the database, and ordering and
post-processing the retrieval results. Fig. 1 shows the block
diagram of a CBIR system.

Fig. 1. Block Diagram of CBIR system
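
The four parts listed above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the system of Fig. 1: each image is assumed to be a flat list of (r, g, b) pixels, the feature is a deliberately simplistic per-channel mean, and the names `extract_features`, `build_feature_db`, and `search` are hypothetical.

```python
import math

def extract_features(pixels):
    """Toy feature vector: the mean of each RGB channel (an assumed, simplistic feature)."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))

def build_feature_db(collection):
    """Parts 1-2: take the collected images and store one feature vector per image."""
    return {name: extract_features(pixels) for name, pixels in collection.items()}

def search(feature_db, query_pixels, top_k=3):
    """Parts 3-4: compare the query with every stored vector and order the results."""
    q = extract_features(query_pixels)
    scored = [(math.dist(q, f), name) for name, f in feature_db.items()]
    scored.sort()  # smallest distance first
    return [name for _, name in scored[:top_k]]

# Hypothetical three-image collection; each image is a flat list of RGB pixels.
collection = {
    "sunset": [(250, 120, 30)] * 4,
    "forest": [(20, 140, 40)] * 4,
    "ocean": [(10, 60, 200)] * 4,
}
feature_db = build_feature_db(collection)
ranked = search(feature_db, [(240, 110, 40)] * 4, top_k=2)
print(ranked)
```

A real system would replace the per-channel mean with richer color, texture, and shape descriptors and would index the feature database rather than scan it linearly.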


II. BRIEF SURVEY ON IR SYSTEM

Color is the most intuitive and obvious feature of an
image, and it is generally described using histograms. The
color histogram method is fast, has low memory requirements,
and is insensitive to changes in image size and to rotation, so
it has attracted extensive attention. Retrieval based on texture
features relies on a description of the image’s texture;
statistical and structural texture features are usually adopted,
as well as features obtained by transforming the spatial
domain into the frequency domain. Three problems need to be
solved in image retrieval based on shape features. First, shape
is usually related to a specific object in the image, so shape
features carry stronger semantic meaning than texture
features [10].
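
The insensitivity of color histograms to size and rotation can be checked with a small sketch. The code below assumes images given as flat lists of (r, g, b) pixels and uses a normalized joint RGB histogram compared by histogram intersection; the bin count and the function names are illustrative choices, not taken from the surveyed systems.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized joint RGB histogram of a flat list of (r, g, b) pixels in 0..255."""
    step = 256 // bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        index = ((r // step) * bins_per_channel + (g // step)) * bins_per_channel + (b // step)
        hist[index] += 1.0
    total = len(pixels)
    return [count / total for count in hist]

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means the two normalized histograms are identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))

# Hypothetical 2x2 image stored row by row, a rotated copy (same pixels, new
# positions), and a copy with every pixel repeated four times (pixel replication).
red, green, blue, white = (255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 255)
image = [red, green, blue, white]
rotated = [blue, red, white, green]
upscaled = [p for p in image for _ in range(4)]
other = [red, red, red, green]

h = color_histogram(image)
same_after_rotation = histogram_intersection(h, color_histogram(rotated))
same_after_scaling = histogram_intersection(h, color_histogram(upscaled))
different = histogram_intersection(h, color_histogram(other))
print(same_after_rotation, same_after_scaling, different)
```

Because the histogram discards pixel positions, the rotated and rescaled copies match the original exactly. The same property explains why two images with very different layouts can share a histogram, which is the limitation noted in the Introduction.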

In annotation-based retrieval, a set of
statistical models is built from the visual features of manually
labeled images and is then used to propagate keywords to
other, unlabeled images. Ontology-based retrieval combines a
domain ontology with visual descriptors to classify the images
and to derive the query views and object views that are
compared against the database. The search then classifies the
resulting images into a relevant group and an irrelevant group
of images.
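
Keyword propagation can be illustrated with a deliberately simple stand-in. The sketch below does not build the statistical models mentioned above; it assumes instead a nearest-neighbor vote, in which an unlabeled image receives the keywords that are most common among its closest manually labeled images. The feature vectors, keywords, and the function name `propagate_keywords` are hypothetical.

```python
import math
from collections import Counter

def propagate_keywords(labeled, unlabeled_features, k=3, max_keywords=2):
    """Give an unlabeled image the keywords most common among its k nearest labeled images."""
    neighbors = sorted(labeled, key=lambda item: math.dist(item[0], unlabeled_features))[:k]
    votes = Counter(word for _, keywords in neighbors for word in keywords)
    return [word for word, _ in votes.most_common(max_keywords)]

# Hypothetical manually labeled images: (feature vector, keywords).
labeled = [
    ((0.9, 0.1), ["xray", "chest"]),
    ((0.8, 0.2), ["xray", "chest"]),
    ((0.85, 0.3), ["xray", "hand"]),
    ((0.1, 0.9), ["mri", "brain"]),
    ((0.2, 0.8), ["mri", "brain"]),
]
predicted = propagate_keywords(labeled, (0.88, 0.15))
print(predicted)
```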


III. IMPROVING RETRIEVAL EFFECTIVENESS IN IR 
SYSTEMS 
In contrast with the database environment, precise
representations for user queries and (text) documents are
difficult to generate in an IR environment. Retrieval
effectiveness is improved by starting out with an imprecise
and incomplete query and iteratively and incrementally
improving the query specification [13]. There are two major
approaches to improving retrieval effectiveness: automated
query expansion and relevance feedback techniques.

A. Automated Query Expansion in IR 

Automated query expansion methods are based on term 
co-occurrences [1], Pseudo-Relevance Feedback (PRF) [1], 
concept-based retrieval [14], and language analysis [6]. 
Language analysis based query expansion methods are not 
discussed in this paper. 
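
As an illustration of pseudo-relevance feedback, the sketch below assumes a toy scoring rule (a count of query-term occurrences) and a three-document collection. It treats the top-ranked documents as relevant without asking the user and adds their most frequent new terms to the query. The function names and documents are hypothetical, and a practical system would also remove stop words and weight the terms.

```python
from collections import Counter

def tokenize(text):
    return text.lower().split()

def rank(query_terms, documents):
    """Rank documents by how many query-term occurrences they contain (a toy scoring rule)."""
    scores = []
    for doc_id, text in documents.items():
        counts = Counter(tokenize(text))
        scores.append((sum(counts[t] for t in query_terms), doc_id))
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [doc_id for score, doc_id in scores if score > 0]

def expand_query(query_terms, documents, top_docs=2, new_terms=2):
    """Pseudo-relevance feedback: assume the top-ranked documents are relevant and
    add their most frequent terms that are not already in the query."""
    counts = Counter()
    for doc_id in rank(query_terms, documents)[:top_docs]:
        counts.update(t for t in tokenize(documents[doc_id]) if t not in query_terms)
    return list(query_terms) + [t for t, _ in counts.most_common(new_terms)]

documents = {
    "d1": "lung xray shows lung nodule",
    "d2": "chest xray lung opacity",
    "d3": "brain mri scan",
}
expanded = expand_query(["xray"], documents, top_docs=2, new_terms=1)
print(expanded)
```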

B. Relevance Feedback Techniques in CBIR 

There have been numerous studies on improving
retrieval effectiveness in CBIR systems based on relevance
feedback (RF) techniques, mostly adopted from the IR area. In
the following we describe salient aspects of some of these
approaches. These studies use their own test collections,
queries, and training data. Therefore, it is rather difficult to
provide a comparative assessment of these approaches.

RF techniques assume two-class relevance
feedback: relevant and non-relevant classes. For example,
support vector machines (SVMs) have been used to
discriminate between “relevant” and “non-relevant” images. In
the context of CBIR, SVMs first map image signatures (i.e.,
n-dimensional vectors whose components correspond to
low-level image features) to a higher-dimensional feature
space (HDFS) using a non-linear transformation associated
with a kernel, and then implicitly perform linear discrimination
between “relevant” and “non-relevant” items in this HDFS.
In one such study, the retrieval user is emulated according to
seven significantly different strategies on four ground-truth
databases of different complexity. The study concludes that
the ranking of the images by the two algorithms studied does
not significantly depend on the selected strategy. Moreover,
the ranking between strategies appears to be independent of
the complexity of the ground-truth classes. Peng (2003)
proposes a multi-class form of relevance feedback retrieval to
offset the disadvantages of two-class relevance feedback. It is shown
that this method is able to create flexible metrics that better 
capture users’ perceived similarity. The method achieves a 
higher level of precision with fewer iterations. Fang and 
Hock (2000) describe a system for CBIR based on 
multidimensional features associated with color, texture, and 
shape. They observe that by jointly matching image
features in a multidimensional space rather than in separate
independent feature spaces, the retrieval precision is
improved from 50% to 90% for the top 10 most similar
images retrieved. The study also shows that including
the features corresponding to the image background
improves retrieval precision. The system also employs
interactive relevance feedback to improve user query 
specification and retrieval effectiveness. For each retrieval 
iteration, the system learns a decision tree to discover 
commonality among a set of images considered as relevant by 
the user for a query. The tree is then used as a model for 
determining which of the unseen images would be of interest
to the user. Some researchers use a Bayesian learning
algorithm that relies on belief propagation to integrate 
relevance feedback provided by the user over a retrieval 
session. This approach entails natural criteria for evaluating 
local image similarity without the need for image 
segmentation. They note that region-based queries are 
considerably less ambiguous than queries based on entire 
images, and hence entail significant improvements in 
retrieval precision. Through experimental results, they
demonstrate that significant improvements in the rate of
convergence to the relevant images are possible through the
inclusion of learning in the retrieval process.
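
The two-class formulation can be illustrated with a small kernel classifier. The sketch below is a stand-in under stated assumptions: it trains a kernel perceptron, not an SVM, because the perceptron fits in a few lines without external libraries, yet it shares the property described above of discriminating linearly in the space induced by a kernel. The RBF kernel, its `gamma` value, the feedback samples, and all function names are hypothetical.

```python
import math

def rbf_kernel(x, y, gamma=2.0):
    """K(x, y) = exp(-gamma * ||x - y||^2), an inner product in an implicit feature space."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def decision(x, samples, labels, alphas):
    """Signed score: linear in the implicit feature space, non-linear in the input space."""
    return sum(a * y * rbf_kernel(s, x) for a, y, s in zip(alphas, labels, samples))

def train_kernel_perceptron(samples, labels, epochs=20):
    """Learn one mistake count per feedback sample; labels are +1 (relevant) or -1."""
    alphas = [0] * len(samples)
    for _ in range(epochs):
        mistakes = 0
        for i, (x, y) in enumerate(zip(samples, labels)):
            if y * decision(x, samples, labels, alphas) <= 0:
                alphas[i] += 1
                mistakes += 1
        if mistakes == 0:  # every feedback sample is classified correctly
            break
    return alphas

# Hypothetical feedback: relevant images sit in two opposite corners (an XOR-like
# layout), so no straight line in the input space separates the two classes.
samples = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
labels = [1, 1, -1, -1]
alphas = train_kernel_perceptron(samples, labels)

# Rank unseen images by their score, most likely relevant first.
unseen = {"img_a": (0.1, 0.1), "img_b": (0.9, 0.1), "img_c": (0.95, 0.9)}
ranking = sorted(unseen, key=lambda name: -decision(unseen[name], samples, labels, alphas))
print(ranking)
```

In a feedback loop, the images the user marks in each iteration would be appended to `samples` and `labels`, and the classifier would be retrained before the next ranking.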

Relevance feedback approaches to CBIR based 
on support vector machine (SVM) learning have been shown 
to significantly improve retrieval performance. These 
approaches require fixed-length image representations — 
SVM kernels represent an inner product in a feature space 
that is a nonlinear transformation of the input space. 
However, region-based CBIR approaches typically use 
variable length image representation and define a similarity 
measure between two variable length representations. 
Therefore, the standard SVM approach cannot be applied to
region-based CBIR. A generalized SVM (GSVM) addresses
this limitation because it allows the use of an arbitrary
kernel. Gondra and Heisterkamp (2004) describe an initial
investigation into a GSVM-based relevance feedback
learning algorithm that learns one-class support vector
machines (1SVMs). Based on experimental results, the study
concludes that the learning algorithm improves retrieval
effectiveness. In [16] they present an improved version of this
work that uses an incremental k-means algorithm to cluster
the 1SVMs. This version improves scalability and accelerates
query processing by considering only a small number of
cluster representatives rather than the entire set of
accumulated 1SVMs.

Zhang and Zhang (2004) study relevance feedback 
in CBIR as a standard two-class pattern classification 
problem with the goal of improving retrieval precision by 
learning through the user relevance feedback data. They have 
investigated two important unique characteristics of the 
problem: small sample collection, and asymmetric sample 
distributions between positive and negative samples. They 
address this problem by leveraging these two unique 
characteristics. Different learning strategies are used for 
positive and negative sample collections. Su, Zhang, Li, and 
Ma (2003) propose an approach to relevance feedback based 
CBIR using a Bayesian classifier. Positive examples in the 
feedback are used to estimate a Gaussian distribution that 
represents the desired images for a given query. The ranking
of retrieved images is determined based on the negative
examples in the relevance feedback. Furthermore, using
relevance feedback and the principal component analysis
(PCA) technique, a feature subspace is extracted and updated
during the feedback process. This not only reduces the
dimensionality of the feature spaces but also yields a proper
subspace for each feature type, further enhancing retrieval
effectiveness. Chua, Chu, and Kankanhalli
(1999) [11] propose a relevance feedback approach to CBIR 
by using text and color attributes of images. A pseudo object 


www.erpublication.org 


A Techniques for CBIR System Based on Image Annotations and Multimodal feature Set 


model based on color coherence vector is used to model color
content. The approach uses user relevance feedback to
estimate the importance of different attributes. Based on
experimental results on a collection of 12,000 images, the
study concludes that relevance feedback and the
pseudo-object-based color model are effective in improving
retrieval performance. Benitez, Beigi, and Chang (1998)
describe MetaSeek, a meta search engine for querying
distributed image collections on the Web. The meta search
engine interfaces with four image search engines:
VisualSeek, WebSeek, QBIC, and Virage. User feedback is
used to evaluate the quality of the search results returned by
each engine, and this history is preserved in a database. Rui,
Huang, and Mehrotra (1998) propose an approach to CBIR
that addresses two issues: the gap between high-level
concepts and low-level image features, and subjectivity in
human perception of image content. Using the user’s
relevance feedback, query term weights are dynamically
adjusted to improve retrieval effectiveness. Experimental
results on a collection of size 70,000 indicate that the
approach significantly reduces the user’s effort in specifying
queries.
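
Dynamic weight adjustment from relevance feedback can be sketched as follows. The update rule used here is an assumption, chosen as one common heuristic rather than taken from the study above: each feature component is weighted by the inverse of its spread over the images the user marked as relevant, so that components on which the relevant images agree dominate the distance. All names and values are hypothetical.

```python
import math
import statistics

def reweight(relevant_features, epsilon=1e-6):
    """Weight each feature component by the inverse of its standard deviation over the
    relevant images, then normalize the weights so that they sum to 1."""
    dims = range(len(relevant_features[0]))
    raw = [1.0 / (statistics.pstdev([f[d] for f in relevant_features]) + epsilon) for d in dims]
    total = sum(raw)
    return [w / total for w in raw]

def weighted_distance(a, b, weights):
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

# Hypothetical feedback: the relevant images agree on component 0 (say, a color
# feature) but vary widely on component 1 (say, a texture feature).
relevant = [(0.80, 0.10), (0.82, 0.90), (0.78, 0.50)]
weights = reweight(relevant)

query = (0.80, 0.50)
same_color = (0.80, 0.90)    # matches the query on component 0 only
same_texture = (0.60, 0.50)  # matches the query on component 1 only
print(weights)
print(weighted_distance(query, same_color, weights),
      weighted_distance(query, same_texture, weights))
```

Before feedback, the unweighted Euclidean distance ranks `same_texture` closer to the query; after re-weighting, `same_color` becomes the closer image, reflecting what the user's feedback indicated to be important.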

IV. CONCLUSION

CBIR systems based on these approaches are expensive 
to develop and maintain due to extensive human labor, and 
consistency and subjectivity concerns in indexing and query 
specification. Advances that led to commercial success in IR 
area (e.g., Web search engines) present a great potential for 
such a success in CBIR also. The ubiquitous and intense 
research interest among CBIR researchers in leveraging 
lexicons, thesauri, and ontologies to effect concept-based 
retrieval using relevance feedback is a positive direction in 
benefiting from the IR advances. However, CBIR evaluation 
frameworks with elaborate benchmarks — test collections, 
representative user queries, relevance judgments, system 
evaluation measures and methods — are essential for making 
rapid strides in CBIR. 

The following three-phased approach 
seems to hold promise for querying generic (i.e., non 
domain-specific) image collections by casual users. First, 
ontology-guided browsing (i.e., retrieval by browsing) is 
needed to make the user develop a conceptual understanding 
of the collection and its semantic dimensions. Using this 
knowledge, in the next phase, the user will specify queries 
that reflect his information need more closely with minimal 
effort. The user then engages in incremental query 
refinement and iterative retrieval by providing relevance 
feedback with the goal of improving precision. Once the user 
retrieves a few images of high relevance, he moves on to the 
third stage, in which he performs retrieval by example using 
these images. The goal in the third stage is to improve recall
without sacrificing precision.